Emotion conversion using Feedforward Neural Networks
Abstract
An emotion comprises several components, such as physiological changes in the body, subjective feelings, and expressive behaviours. In the speech signal, these changes are mainly observed in prosody parameters such as pitch, duration, and energy. In this work, prosody parameters are modified using instants of significant excitation (epochs), which are detected using a Zero Frequency Filtering (ZFF) based method. Epoch locations in voiced speech correspond to glottal closure instants, and in unvoiced regions they correspond to random instants of significant excitation. Prosody parameters for the target emotions are predicted from a Hindi emotional speech database. Anger and sadness are considered as the target emotions in the proposed emotion conversion framework. Feedforward neural network models are explored to predict the prosody parameters. The predicted syllable-level prosody parameters are incorporated into neutral speech to produce the desired emotional speech. After the emotion-specific prosody is incorporated, the perceptual quality of the transformed speech is evaluated by listening tests.
Keywords: emotion conversion; zero frequency filtering; pitch; duration; glottal closure instance; feedforward neural network
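The abstract does not give the details of its epoch detector, so the following is only a minimal sketch of the commonly described zero-frequency filtering recipe: difference the signal, pass it through a cascade of two zero-frequency resonators, remove the slowly varying trend with a local mean, and take the negative-to-positive zero crossings as epochs. The function name `zff_epochs`, the 1.5 pitch-period window, the three trend-removal passes, and the synthetic impulse train are all assumptions made for illustration, not the paper's settings.

```python
import numpy as np

def zff_epochs(speech, fs, mean_pitch_period_s=0.01, passes=3):
    """Hypothetical helper: estimate epoch locations (sample indices) via ZFF."""
    s = np.asarray(speech, dtype=np.float64)
    # Difference the signal to remove any DC offset.
    x = np.diff(s, prepend=s[0])
    # Each zero-frequency resonator, y[n] = 2y[n-1] - y[n-2] + x[n], is a
    # double integrator, so a cascade of two equals four cumulative sums.
    y = x
    for _ in range(4):
        y = np.cumsum(y)
    # Remove the polynomial trend by subtracting a local mean computed over
    # roughly 1.5 average pitch periods (assumed value), several times.
    half = int(round(0.75 * mean_pitch_period_s * fs))
    window = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):
        y = y - np.convolve(y, window, mode="same")
    # Epoch candidates: negative-to-positive zero crossings.
    crossings = np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0)) + 1
    # Drop the edges, where zero padding in the local mean corrupts the signal.
    margin = passes * half + 1
    keep = (crossings >= margin) & (crossings < len(y) - margin)
    return crossings[keep]

# Usage on a synthetic excitation: one impulse every 80 samples at 8 kHz.
fs = 8000
signal = np.zeros(fs)
signal[40::80] = 1.0
epochs = zff_epochs(signal, fs)
intervals = np.diff(epochs)
```

On real speech the crossing polarity depends on the recording polarity, and the window length would normally be derived from an estimate of the speaker's average pitch period rather than fixed in advance.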
Similar resources
MediaEval 2015: Music Emotion Recognition based on Feed-Forward Neural Network
In this paper, we describe the music emotion recognition system named as JU_NLP to find the dynamic valence and arousal values of a song continuously considered from 15 second to its end in an interval of 0.5 seconds. We adopted the feed-forward networks with 10 hidden layers to build the regression model. We used the correlation-based method to find out suitable features among all the features...
Multi-View Face Detection in Open Environments using Gabor Features and Neural Networks
Multi-view face detection in open environments is a challenging task, due to the wide variations in illumination, face appearances and occlusion. In this paper, a robust method for multi-view face detection in open environments, using a combination of Gabor features and neural networks, is presented. Firstly, the effect of changing the Gabor filter parameters (orientation, frequency, standard d...
Evaluating the Performance of Different Artificial Intelligence Methods and a Statistical Method in Estimating Runoff (Case Study: Shahid Nouri Watershed, Kakhk, Gonabad)
Rainfall-runoff models are used in the field of hydrology and runoff estimation for many years, but despite existing numerous models, the regular release of new models shows that there is still not a model that can provide sophisticated estimations with high accuracy and performance. In order to achieve the best results, modeling and identification of factors affecting the output of the model i...
BLSTM neural networks for speech driven head motion synthesis
Head motion naturally occurs in synchrony with speech and carries important intention, attitude and emotion factors. This paper aims to synthesize head motions from natural speech for talking avatar applications. Specifically, we study the feasibility of learning speech-to-head-motion regression models by two types of popular neural networks, i.e., feed-forward and bidirectional long short-term...
A hybrid EEG-based emotion recognition approach using Wavelet Convolutional Neural Networks (WCNN) and support vector machine
Nowadays, deep learning and convolutional neural networks (CNNs) have become widespread tools in many biomedical engineering studies. CNN is an end-to-end tool which makes processing procedure integrated, but in some situations, this processing tool requires to be fused with machine learning methods to be more accurate. In this paper, a hybrid approach based on deep features extracted from Wave...
Publication date: 2013